Exploring the spatial frequency requirements of audio-visual speech using superimposed facial motion
Authors
Abstract
While visually complex stimuli such as human faces contain information across a wide range of spatial frequencies, the information relevant to a specific perceptual judgement may be concentrated in distinct spatial-frequency bands. For example, previous work on static face perception has shown that face recognition relies primarily on low spatial-frequency information, while other tasks, such as identifying facial expressions, may require higher spatial frequencies. An innovative approach to identifying such spatial-frequency biases has been the use of hybrid visual stimuli: superimpositions of two distinct images, one spatially filtered to remove high spatial-frequency information (i.e., low-pass filtered) and the other filtered to remove low spatial-frequency information (i.e., high-pass filtered) (Schyns and Oliva, 1999). By placing these two spatial-frequency portions of the image in direct competition with each other, hybrid stimuli allow the identification of spatial-frequency bands that are preferentially processed by the visual system, rather than merely sufficient for the task.
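To make the construction concrete, the following is a minimal Python sketch of how such a hybrid stimulus can be generated with a Gaussian filter; the function name make_hybrid and the cutoff parameter sigma are illustrative assumptions, not details taken from the study.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_hybrid(image_a, image_b, sigma=6.0):
    """Build a hybrid stimulus from two same-shaped grayscale images.

    The low spatial frequencies of image_a are superimposed on the
    high spatial frequencies of image_b, placing the two bands in
    direct competition (cf. Schyns and Oliva, 1999). sigma is the
    Gaussian blur width in pixels; it is an illustrative default,
    not a value taken from the paper.
    """
    a = image_a.astype(float)
    b = image_b.astype(float)
    low = gaussian_filter(a, sigma)        # low-pass: keep coarse structure
    high = b - gaussian_filter(b, sigma)   # high-pass: keep fine detail
    hybrid = low + high                    # superimpose the two bands
    # Rescale to [0, 1] for display.
    return (hybrid - hybrid.min()) / (hybrid.max() - hybrid.min())
```

Viewed up close, the resulting image is dominated by the high-pass component; from a distance, or when blurred, the low-pass component dominates, which is what allows the two frequency bands to compete for perception.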
Similar resources
Speech Driven MPEG-4 Facial Animation for Turkish
In this study, a system that generates visual speech by synthesizing 3D face points has been implemented. The synthesized face points drive MPEG-4 facial animation. To produce realistic and natural speech animation, a codebook-based technique, trained with audio-visual data from a speaker, was employed. An audio-visual speech database was created using a 3D facial motion capture syst...
Event-Related Potentials Associated with Somatosensory Effect in Audio-Visual Speech Perception
Speech perception often involves multisensory processing. Although previous studies have demonstrated visual [1, 2] and somatosensory interactions [3, 4] with auditory processing, it is not clear whether somatosensory information can contribute to audio-visual speech perception. This study explored the neural consequences of somatosensory interactions in audio-visual speech pro...
A comparison of acoustic coding models for speech-driven facial animation
This article presents a thorough experimental comparison of several acoustic modeling techniques in terms of their ability to capture information related to orofacial motion. These models include (1) Linear Predictive Coding and Line Spectral Frequencies, which model the dynamics of the speech production system, (2) Mel Frequency Cepstral Coefficients and Perceptual Critical Feature Bands, which encod...
Real-time speech-driven face animation with expressions using neural networks
A real-time speech-driven synthetic talking face provides an effective multimodal communication interface in distributed collaboration environments. Nonverbal gestures such as facial expressions are important to human communication and should be considered by speech-driven face animation systems. In this paper, we present a framework that systematically addresses facial deformation modeling, au...
Speaker-independent 3D face synthesis driven by speech and text
In this study, a complete system that generates visual speech by synthesizing 3D face points has been implemented. The estimated face points drive MPEG-4 facial animation. This system is speaker independent and can be driven by audio or both audio and text. The synthesis of visual speech was realized by a codebook-based technique, which is trained with audio-visual data from a speaker. An audio...